FudanNLP: A Toolkit for Chinese Natural Language Processing
Authors
Abstract
The need for Chinese natural language processing (NLP) is growing across a range of research and commercial applications. However, most current Chinese NLP tools and components still have a wide range of issues that need further improvement and development. FudanNLP is an open-source toolkit for Chinese NLP that uses statistics-based and rule-based methods to handle tasks such as word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, time phrase recognition, and anaphora resolution.
Related resources
Overview of the NLPCC 2017 Shared Task: Chinese News Headline Categorization
In this paper, we give an overview of the shared task at the CCF Conference on Natural Language Processing & Chinese Computing (NLPCC 2017): Chinese News Headline Categorization. The dataset of this shared task consists of 18 classes and 12,000 short texts, along with the corresponding labels for each class. The dataset and example code can be accessed at https://github.com/FudanNLP/nlpcc2017_news_headli...
Overview of the NLPCC-ICCPOL 2016 Shared Task: Chinese Word Segmentation for Micro-Blog Texts
In this paper, we give an overview of the shared task at the 5th CCF Conference on Natural Language Processing & Chinese Computing (NLPCC 2016): Chinese word segmentation for micro-blog texts. Unlike the widely used newswire datasets, the dataset of this shared task consists of relatively informal micro-texts. In addition, we use a new psychometric-inspired evaluation metric for ...
FudanNLP at RITE 2011: a Shallow Semantic Approach to Textual Entailment
RITE is the task of recognizing logical relations between texts. This paper presents FDCS's approach to the NTCIR9-RITE Chinese simplified BC & MC subtasks. Our system is built on a machine learning architecture with features selected using shallow semantic methods, including named entity recognition, date & time expression extraction, word overlap, and negation concept recognition. FudanNLP is widely used b...
NiuParser: A Chinese Syntactic and Semantic Parsing Toolkit
We present NiuParser, a new toolkit for Chinese syntactic and semantic analysis. It can handle a wide range of natural language processing (NLP) tasks in Chinese, including word segmentation, part-of-speech tagging, named entity recognition, chunking, constituent parsing, dependency parsing, and semantic role labeling. The NiuParser system runs fast and shows state-of-the-art performance on sever...
Chinese Web Scale Linguistic Datasets and Toolkit
The web provides a huge collection of pages for researchers studying natural languages. However, processing web-scale text is not an easy task and requires substantial computational and linguistic resources. In this paper, we introduce two Chinese part-of-speech tagged web-scale datasets and describe tools that make them easy to use for NLP applications. The first is a Chinese segmented and POS-tag...